15 research outputs found

    De novo Transcriptome Assemblies of Rana (Lithobates) catesbeiana and Xenopus laevis Tadpole Livers for Comparative Genomics without Reference Genomes

    Get PDF
    In this work we studied the liver transcriptomes of two frog species, the American bullfrog (Rana (Lithobates) catesbeiana) and the African clawed frog (Xenopus laevis). We used high throughput RNA sequencing (RNA-seq) data to assemble and annotate these transcriptomes, and compared how their baseline expression profiles change when tadpoles of the two species are exposed to thyroid hormone. We generated more than 1.5 billion RNA-seq reads in total for the two species under two conditions as treatment/control pairs. We de novo assembled these reads using Trans-ABySS to reconstruct reference transcriptomes, obtaining over 350,000 and 130,000 putative transcripts for R. catesbeiana and X. laevis, respectively. Using available genomics resources for X. laevis, we annotated over 97% of our X. laevis transcriptome contigs, demonstrating the utility and efficacy of our methodology. Leveraging this validated analysis pipeline, we also annotated the assembled R. catesbeiana transcriptome. We used the expression profiles of the annotated genes of the two species to examine the similarities and differences between the tadpole liver transcriptomes. We also compared the gene ontology terms of expressed genes to measure how the animals react to a challenge by thyroid hormone. Our study reports three main conclusions. First, de novo assembly of RNA-seq data is a powerful method for annotating and establishing transcriptomes of non-model organisms. Second, the liver transcriptomes of the two frog species, R. catesbeiana and X. laevis, show many common features, and the distribution of their gene ontology profiles are statistically indistinguishable. Third, although they broadly respond the same way to the presence of thyroid hormone in their environment, their receptor/signal transduction pathways display marked differences

    Large-scale machine learning-based phenotyping significantly improves genomic discovery for optic nerve head morphology.

    Get PDF
    Genome-wide association studies (GWASs) require accurate cohort phenotyping, but expert labeling can be costly, time intensive, and variable. Here, we develop a machine learning (ML) model to predict glaucomatous optic nerve head features from color fundus photographs. We used the model to predict vertical cup-to-disc ratio (VCDR), a diagnostic parameter and cardinal endophenotype for glaucoma, in 65,680 Europeans in the UK Biobank (UKB). A GWAS of ML-based VCDR identified 299 independent genome-wide significant (GWS; p ≤ 5 × 10-8) hits in 156 loci. The ML-based GWAS replicated 62 of 65 GWS loci from a recent VCDR GWAS in the UKB for which two ophthalmologists manually labeled images for 67,040 Europeans. The ML-based GWAS also identified 93 novel loci, significantly expanding our understanding of the genetic etiologies of glaucoma and VCDR. Pathway analyses support the biological significance of the novel hits to VCDR: select loci near genes involved in neuronal and synaptic biology or harboring variants are known to cause severe Mendelian ophthalmic disease. Finally, the ML-based GWAS results significantly improve polygenic prediction of VCDR and primary open-angle glaucoma in the independent EPIC-Norfolk cohort

    The Power of Duples (in Self-Assembly): It's Not So Hip To Be Square

    Full text link
    In this paper we define the Dupled abstract Tile Assembly Model (DaTAM), which is a slight extension to the abstract Tile Assembly Model (aTAM) that allows for not only the standard square tiles, but also "duple" tiles which are rectangles pre-formed by the joining of two square tiles. We show that the addition of duples allows for powerful behaviors of self-assembling systems at temperature 1, meaning systems which exclude the requirement of cooperative binding by tiles (i.e., the requirement that a tile must be able to bind to at least 2 tiles in an existing assembly if it is to attach). Cooperative binding is conjectured to be required in the standard aTAM for Turing universal computation and the efficient self-assembly of shapes, but we show that in the DaTAM these behaviors can in fact be exhibited at temperature 1. We then show that the DaTAM doesn't provide asymptotic improvements over the aTAM in its ability to efficiently build thin rectangles. Finally, we present a series of results which prove that the temperature-2 aTAM and temperature-1 DaTAM have mutually exclusive powers. That is, each is able to self-assemble shapes that the other can't, and each has systems which cannot be simulated by the other. Beyond being of purely theoretical interest, these results have practical motivation as duples have already proven to be useful in laboratory implementations of DNA-based tiles

    Integrating genomics and metabolomics for scalable non-ribosomal peptide discovery.

    Get PDF
    Non-Ribosomal Peptides (NRPs) represent a biomedically important class of natural products that include a multitude of antibiotics and other clinically used drugs. NRPs are not directly encoded in the genome but are instead produced by metabolic pathways encoded by biosynthetic gene clusters (BGCs). Since the existing genome mining tools predict many putative NRPs synthesized by a given BGC, it remains unclear which of these putative NRPs are correct and how to identify post-assembly modifications of amino acids in these NRPs in a blind mode, without knowing which modifications exist in the sample. To address this challenge, here we report NRPminer, a modification-tolerant tool for NRP discovery from large (meta)genomic and mass spectrometry datasets. We show that NRPminer is able to identify many NRPs from different environments, including four previously unreported NRP families from soil-associated microbes and NRPs from human microbiota. Furthermore, in this work we demonstrate the anti-parasitic activities and the structure of two of these NRP families using direct bioactivity screening and nuclear magnetic resonance spectrometry, illustrating the power of NRPminer for discovering bioactive NRPs

    The non-cooperative tile assembly model is not intrinsically universal or capable of bounded Turing machine simulation

    Get PDF
    The field of algorithmic self-assembly is concerned with the computational and expressive power of nanoscale self-assembling molecular systems. In the well-studied cooperative, or temperature 2, abstract tile assembly model it is known that there is a tile set to simulate any Turing machine and an intrinsically universal tile set that simulates the shapes and dynamics of any instance of the model, up to spatial rescaling. It has been an open question as to whether the seemingly simpler noncooperative, or temperature 1, model is capable of such behaviour. Here we show that this is not the case, by showing that there is no tile set in the noncooperative model that is intrinsically universal, nor one capable of time-bounded Turing machine simulation within a bounded region of the plane. Although the noncooperative model intuitively seems to lack the complexity and power of the cooperative model it has been exceedingly hard to prove this. One reason is that there have been few tools to analyse the structure of complicated paths in the plane. This paper provides a number of such tools. A second reason is that almost every obvious and small generalisation to the model (e.g. allowing error, 3D, non-square tiles, signals/wires on tiles, tiles that repel each other, parallel synchronous growth) endows it with great computational, and sometimes simulation, power. Our main results show that all of these generalisations provably increase computational and/or simulation power. Our results hold for both deterministic and nondeterministic noncooperative systems. Our first main result stands in stark contrast with the fact that for both the cooperative tile assembly model, and for 3D noncooperative tile assembly, there are respective intrinsically universal tilesets. Our second main result gives a new technique (reduction to simulation) for proving negative results about computation in tile assembly

    The program-size complexity of self-assembled paths

    Get PDF
    We prove a Pumping Lemma for the noncooperative abstract Tile Assembly Model, a model central to the theory of algorithmic self-assembly since the beginning of the field. This theory suggests, and our result proves, that small differences in the nature of adhesive bindings between abstract square molecules gives rise to vastly different expressive capabilities. In the cooperative abstract Tile Assembly Model, square tiles attach to each other using multi-sided cooperation of one, two or more sides. This precise control of tile binding is directly exploited for algorithmic tasks including growth of specified shapes using very few tile types, as well as simulation of Turing machines and even self-simulation of self-assembly systems. But are cooperative bindings required for these computational tasks? The definitionally simpler noncooperative (or Temperature 1) model has poor control over local binding events: tiles stick if they bind on at least one side. This has led to the conjecture that it is impossible for it to exhibit precisely controlled growth of computationally-defined shapes. Here, we prove such an impossibility result. We show that any planar noncooperative system that attempts to grow large algorithmically-controlled tile-efficient assemblies must also grow infinite non-algorithmic (pumped) structures with a simple closed-form description, or else suffer blocking of intended algorithmic structures. Our result holds for both directed and nondirected systems, and gives an explicit upper bound of (8|T|)4|T|+1(5|σ| + 6), where |T| is the size of the tileset and |σ| is the size of the seed assembly, beyond which any path of tiles is pumpable or blockable

    Integrating genomics and metabolomics for scalable non-ribosomal peptide discovery

    No full text
    Non-Ribosomal Peptides (NRPs) represent a biomedically important class of natural products that include a multitude of antibiotics and other clinically used drugs. NRPs are not directly encoded in the genome but are instead produced by metabolic pathways encoded by biosynthetic gene clusters (BGCs). Since the existing genome mining tools predict many putative NRPs synthesized by a given BGC, it remains unclear which of these putative NRPs are correct and how to identify post-assembly modifications of amino acids in these NRPs in a blind mode, without knowing which modifications exist in the sample. To address this challenge, here we report NRPminer, a modification-tolerant tool for NRP discovery from large (meta)genomic and mass spectrometry datasets. We show that NRPminer is able to identify many NRPs from different environments, including four previously unreported NRP families from soil-associated microbes and NRPs from human microbiota. Furthermore, in this work we demonstrate the anti-parasitic activities and the structure of two of these NRP families using direct bioactivity screening and nuclear magnetic resonance spectrometry, illustrating the power of NRPminer for discovering bioactive NRPs. Current genome mining methods predict many putative non-ribosomal peptides (NRPs) from their corresponding biosynthetic gene clusters, but it remains unclear which of those exist in nature and how to identify their post-assembly modifications. Here, the authors develop NRPminer, a modification-tolerant tool for the discovery of NRPs from large genomic and mass spectrometry datasets, and use it to find 180 NRPs from different environments
    corecore